Prioritizing candidate disease genes by network-based boosting of genome-wide association data.
Abstract
Network "guilt by association" (GBA) is a proven approach for identifying novel disease genes based on the observation that similar mutational phenotypes arise from functionally related genes. In principle, this approach could account even for nonadditive genetic interactions, which underlie the synergistic combinations of mutations often linked to complex diseases. Here, we analyze a large-scale, human gene functional interaction network (dubbed HumanNet). We show that candidate disease genes can be effectively identified by GBA in cross-validated tests using label propagation algorithms related to Google's PageRank. However, GBA has been shown to work poorly in genome-wide association studies (GWAS), where many genes are somewhat implicated, but few are known with very high certainty. Here, we resolve this by explicitly modeling the uncertainty of the associations and incorporating the uncertainty for the seed set into the GBA framework. We observe a significant boost in the power to detect validated candidate genes for Crohn's disease and type 2 diabetes by comparing our predictions to results from follow-up meta-analyses, with incorporation of the network serving to highlight the JAK-STAT pathway and associated adaptors GRB2/SHC1 in Crohn's disease and BACH2 in type 2 diabetes. Consideration of the network during GWAS thus conveys some of the benefits of enrolling more participants in the study. More generally, we demonstrate that a functional network of human genes provides a valuable statistical framework for prioritizing candidate disease genes, both for candidate gene-based and GWAS-based studies.
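To make the idea of propagating uncertain seed labels concrete, here is a minimal sketch of a PageRank-style random walk with restart on a toy network. It is an illustration only and should not be read as the scoring scheme used in the paper: the gene names (G1 to G5), the link weights, the seed confidences, and the `restart` parameter are all hypothetical. Each seed gene contributes to the restart distribution in proportion to its confidence, so a weakly implicated gene pulls less score toward its neighbors than a strongly implicated one.

```python
def propagate(edges, seed_weights, restart=0.3, iters=200, tol=1e-10):
    """Random walk with restart over a weighted, undirected gene network.

    edges:        dict mapping (gene_a, gene_b) -> positive link weight
    seed_weights: dict mapping gene -> non-negative confidence (soft seed label)
    Returns a dict mapping gene -> score; scores sum to 1.
    """
    # Build a symmetric adjacency map.
    neighbors = {}
    for (a, b), w in edges.items():
        neighbors.setdefault(a, {})[b] = w
        neighbors.setdefault(b, {})[a] = w
    for g in seed_weights:
        neighbors.setdefault(g, {})  # seeds without links still get a node

    total = sum(seed_weights.values())
    if total <= 0:
        raise ValueError("at least one seed needs a positive weight")

    genes = sorted(neighbors)
    # Restart distribution: seed confidences normalized to sum to 1.
    prior = {g: seed_weights.get(g, 0.0) / total for g in genes}
    strength = {g: sum(neighbors[g].values()) for g in genes}

    scores = dict(prior)
    for _ in range(iters):
        # Score sitting on unlinked nodes is returned to the seeds.
        dangling = sum(scores[g] for g in genes if strength[g] == 0)
        new = {}
        for g in genes:
            inflow = sum(scores[n] * w / strength[n]
                         for n, w in neighbors[g].items())
            new[g] = (restart * prior[g]
                      + (1 - restart) * (inflow + dangling * prior[g]))
        delta = sum(abs(new[g] - scores[g]) for g in genes)
        scores = new
        if delta < tol:
            break
    return scores


# Hypothetical toy chain G1 - G2 - G3 - G4 - G5 with two uncertain seeds.
toy_edges = {("G1", "G2"): 1.0, ("G2", "G3"): 1.0,
             ("G3", "G4"): 1.0, ("G4", "G5"): 1.0}
toy_seeds = {"G1": 0.9, "G3": 0.6}  # assumed confidences, not real GWAS data

toy_scores = propagate(toy_edges, toy_seeds)
for gene, score in sorted(toy_scores.items(), key=lambda kv: -kv[1]):
    print(f"{gene}\t{score:.4f}")
```

In this toy chain, the non-seed gene G2 sits between both seeds and therefore ranks above G4 and G5, which is the "guilt by association" effect in miniature. In a real analysis the seed confidences would have to come from the association data itself, and how to derive them is a modeling choice this sketch does not address.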
Related articles
GenePANDA—a novel network-based gene prioritizing tool for complex diseases
Here we describe GenePANDA, a novel network-based tool for prioritizing candidate disease genes. GenePANDA assesses whether a gene is likely a candidate disease gene based on its relative distance to known disease genes in a functional association network. A unique feature of GenePANDA is the introduction of adjusted network distance derived by normalizing the raw network distance between two g...
GWAB: a web server for the network-based boosting of human genome-wide association data
During the last decade, genome-wide association studies (GWAS) have represented a major approach to dissect complex human genetic diseases. Due in part to limited statistical power, most studies identify only small numbers of candidate genes that pass the conventional significance thresholds (e.g. P ≤ 5 × 10⁻⁸). This limitation can be partly overcome by increasing the sample size, but this come...
Prioritizing candidate disease genes by network-based boosting of genome-wide association data
Supplemental material: http://genome.cshlp.org/content/suppl/2011/04/28/gr.118992.110.DC1.html. Published online May 2, 2011 in advance of the print journal. Accepted preprint: peer-reviewed and accepted for publication but not copyedited or typeset; the preprint is likely to differ from the final, published version.
eResponseNet: a package prioritizing candidate disease genes through cellular pathways
MOTIVATION: Although genome-wide association studies (GWAS) have found many common genetic variants associated with human diseases, it remains a challenge to elucidate the functional links between associated variants and complex traits. RESULTS: We developed a package called eResponseNet by implementing and extending the existing ResponseNet algorithm for prioritizing candidate disease genes th...
Genome-wide haplotype association study identify TNFRSF1A, CASP7, LRP1B, CDH1 and TG genes associated with Alzheimer's disease in Caribbean Hispanic individuals
Alzheimer's disease (AD) is an acquired disorder of cognitive and behavioral impairment. It is considered to be caused by a variety of factors, such as age, environment and genetics. To identify the genetic factors affecting AD, we used a bioinformatic approach that combined a genome-wide haplotype-based association study with gene prioritization. The raw SNP genotypes data ...
Journal: Genome Research
Volume 21, Issue 7
Pages: -
Publication date: 2011